Thermodynamics of nano-spheres encapsulated in virus capsids 
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We investigate the thermodynamics of complexation of functionalized charged nano-spheres with 
viral proteins. The physics of this problem is governed by electrostatic interaction between the 
proteins and the nano-sphere cores (screened by salt ions), but also by configurational degrees of 
freedom of the charged protein N-tails. We approach the problem by constructing an appropriate 
complexation free energy functional. On the basis of both numerical and analytical studies of this 
functional we construct the phase diagram for the assembly which contains the information on the 
assembled structures that appear in the thermodynamical equilibrium, depending on the size and 
surface charge density of the nano-sphere cores. We show that both the nano-sphere core charge as 
well as its radius determine the size of the capsid that forms around the core. 
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INTRODUCTION 



Viruses have optimized the feat of packaging genome 
molecules and delivering them to the appropriate cells. In 
its simplest form, a virus consists of a rigid protein shell 
(capsid) that surrounds and protects the genetic mate- 
rial (cither RNA or DNA) from chemical and physical 
assaults^. A viral capsid contains several copies of either 
one type of protein or of a few slightly different kinds. A 
number of in vitro self-assembly experiments reveal that 
the protein subunits of many RNA viruses can assemble 
spontaneously not only around their own genome but the 
genomes derived from other viruses and various non- viral 
anionic polymers. All these features in addition to their 
extraordinary highly symmetric shape and monodisperse 
size distributions make viruses ideal structures for gene 
therapy, drug delivery and various nanotechnology and 
material science applications^!^. To this end, the num- 
ber of experiments and theoretical research aimed at un- 
derstanding the physical basis of assembly of viruses and 
the factors influencing the structure and size of viral cap- 
sids is amazingly soaring^^^I^d^^Hi^s.ig^ 

The majority of viral capsids have either spherical or 
elongated structures. Most spherical viruses have struc- 
tures with icosahedral symmetry and contain 60T protein 
subunits where T is the structural index of viral shells 
and is determined from the relation T = h 2 + k 2 + kh 
with h and k nonnegative integers^. While the capsid 
protein of some viruses can form only one structure with 
a specific size, the capsid protein of many others are more 
flexible and adopt various structures with different sizes. 



For example, tobacco mosaic virus (TMV) capsid pro- 
teins assemble into tubular structures regardless of the 
shape and size of their cargo^. This indicates that the 
shape of the virus is solely dictated by the intrinsic prop- 
erty of protein subunits. In the other end of the spec- 
trum, many experiments show that the capsid proteins 
of cowpea cholorotic mosaic virus (CCMV), a spherical 
RNA virus, are able to form capsid of various size and 
shapes^i^. 

Over 35 years ago, Adolph and Butler— and more re- 
cently Lavelle et al£& performed a series of in vitro ex- 
periments with CCMV capsid proteins in the absence 
of RNA and found that depending on the pH and ionic 
strength several different structures assembled. A no- 
table feature of the constructed shape "phase" diagram 
based on these experiments is the change from an icosa- 
hedral T — 3 structure to a cylindrical shape upon a 
decrease in ionic strength and an increase in pH reveal- 
ing the important role of electrostatic interaction on the 
size and shape of empty viral shells. 

There have been a number of different experiments 
and theoretical s t u dies2§i2L2&22i22i2ii22i2iL to investigate 
the impact on the structure of capsids of the shape and 
length of genome. To explore the effect of cargo on the 
morphology of viral capsid, Zlotnick and coworkers ex- 
amined the assembly of capsid proteins of CCMV around 
heterogeneous DNA longer than 500 base pairs and found 
that tubular structures spontaneously formed^. Note 
that CCMV has at T = 3 icosahedral structure in its 
native form. 

Quite remarkably, almost half a century ago, Bancroft 
and coworkers (see e.g. Ref. [23h demonstrated the im- 
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portant role of stoichiometry ratio of capsid proteins and 
genome in the structure of viral shells. According to 
their experiments CCMV capsid proteins could encap- 
sidatc TMV RNAs which is about 6000 nucleotides in 
viral particles of various sizes. Depending on the ratio 
of RNA-protein concentrations, T — 3, T = 4 or T = 7 
structures can form. 

More recent experiments with PSS (polystyrene sul- 
fonate, a highly flexible polyelectrolyte chain) and 
CCMV capsid proteins indicate the significance of length 
of genome in that the size of CCMV viral shells encapsi- 
dating PSS vary from 22 nm to 27 nm when the molecular 
weight of PSS varies from 400KDa to 3.4 MDa-ii. The im- 
pact of the size of cargo on the diameter of capsid is quite 
transparent in the experiments of Dragnea and coworkers 
in which they found that the capsid proteins of Bromo 
Mosaic Virus, another spherical plant virus, are able to 
package the functionalized gold nano-spherical particles 
of diameters of 6, 9 and 12 nm to form virus like par- 
ticles (VLP) with T = 1, T = 2, or T = 3 structures, 
respectively^. 

All the aforementioned experiments focus on the im- 
portant role of size and structure of genome and stoi- 
chiometric ratio of genome to protein in determining the 
structure of viral shells. However, several experimental 
and theoretical studies reveal that electrostatic interac- 
tion is the driving force for the assembly of capsid pro- 
teins around anionic cargos, and thus it is crucial to study 
the impact of cargo charge density on the structure of 
capsids. In fact, a careful study of several single stranded 
RNA viruses show that there is a linear relation between 
the number of charges on the capsid inner surface and on 
their genome^. An important question, then, naturally 
arises: Could we change the size of a capsid by changing 
the net charge of its cargo? More specifically, in the ex- 
periments of Dragnea and coworkers with nano-spheres, 
does a T = 3 structure form, if one increases the charge 
density of the 9 nm cores which normally form " pseudo" 
T = 2 structures? 

In this paper, we investigate the interplay between the 
charge density and size of nano-cargos in virus assembly. 
Similar to the experiments of Dragnea and coworkers, we 
consider negatively charged nano-spheres which interact 
with positively charged capsid inner surface, under phys- 
iological condition and find that in addition to the di- 
ameter of the encapsidated nano-spheres, the total net 
charges on cargos have a significant impact on the size of 
viral capsids. 

An important feature of several RNA viruses, includ- 
ing CCMV and BMV mentioned above, is the presence 
of cationic polypeptide chains that form the N-termini of 
the capsid protein. Rich in basic amino acids, there is 
a total of thousands of charges on the N-terminal tails 
which extend into the capsid interior and are responsi- 
ble for the absorption of RNA to the inner capsid sur- 
face. Very recent in vitro studies of Aniagyei et al. re- 
veal that a mutant of CCMV coat proteins lacking most 
of the N-terminal domain, AA34, assemble around neg- 



atively charged 12 nm spherical cores to form T = 2 
structures^. Note that native CCMV proteins form a 
T = 3 structure around 12 nm spherical cores. Our cal- 
culations also show that the N-terminal arms can have 
a major impact on the virion structure and (as shown 
in Fig. 3 and 4), they can significantly modify the free 
energy landscape of viral structures. One also has to 
consider that the deletion of the N-terminal tails might 
change the preferred angle between the protein subunits, 
an effect which is not taken into account in the present 
study. According to our studies, depending on the cargo 
charge density and the presence of N-terminal tails it 
might be advantageous for capsid proteins to form rela- 
tively smaller or bigger shells compared to their native 
structures. The effect of N-terminal on the free energy 
of viral capsids has been investigated previously^. Here 
we take another approach that enables us to study the 
energetics of complexation of proteins and core in more 
details. Our emphasis is also on different aspects of the 
assembly, in particular the formation of differently sized 
structures depending on the conditions. 

The outline of the paper is as follows. In Section II, we 
present our model to calculate the free energy between 
capsid inner surface and a rigid sphere including the in- 
teraction of positively charged N-terminal tails with the 
spherical cargo. In Sec. Ill, we present our numerical 
results and in Sec. IV we discuss our findings and their 
implications, and summarize our conclusion. 



II. THEORETICAL DESCRIPTION OF 
ENERGETICS AND THERMODYNAMICS OF 
THE ASSEMBLY 

Here, we consider encapsidation of charged nano- 
particles (whose number is n c ) within the virus capsid 
and the dependence of the formation free energy on the 
parameters describing the system. We assume that the 
solution consists of monovalent salt (of bulk concentra- 
tion Co), dissolved protein monomers (or, more gener- 
ally, basic protein subunits, which may be e.g. protein 
dimers as is the case for hepatitis B virus - their num- 
ber is assumed to be n p ) and spherical cores that are 
perfectly monodispcrse with respect to radius R\ and 
surface charge density a\ . All the particles are dissolved 
in a medium whose dielectric constant is eoe r (we shall 
take e r — 80, i.e. water). An assembly problem of this 
type involves many parameters, including the concentra- 
tions of cores (n c /V) and proteins (n p /V, where V is 
the volume of the solution). These two parameters im- 
portantly influence the assembly phase diagram, in ad- 
dition to the energy of the assembled complex, which is 
a fairly complicated quantity in itself. One can expect 
formation of variously sized protein capsids (described 
by different Caspar-Klug T numbers) around the core, 
depending both on the properties of the core but also on 
particle concentrations (n c /V and n p /V). We shall first, 
however, examine the energetics of the formed complex 
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(capsid + core). 

A. Energetics of the assembled complex 

In general, we shall assume that the protein charge 
is distributed along the flexible N-tails. We treat 
the tails as generic polyelectrolytes with intermonomer 
bondlength a, partial charge p per monomer, and non- 
electrostatic excluded volume interactions characterized 



by the excluded volume v. Our approach is quite simi- 
lar to that exposed in Ref. l32|. The free energy of the 
complex in the subspace of fixed total number of poly- 
electrolyte monomers, N, can be calculated from 

F = J f(r)d 3 r -fx (J d 3 r\^(r)\ 2 - N^j , (1) 

where p is the Lagrange multiplier enforcing the condi- 
tion of fixed number of monomers, and 



f(r) 
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(2) 



The free energy is a functional of the monomer den- 
sity field of the polyelectrolyte chains (\P 2 (r)) and the 
mean electrostatic potential ($(r)). The complex free en- 
ergy also depends on the salt concentration fields (c ± (r)) 
whose chemical potentials are denoted by /i ± . One can 
assume that in addition to the charge located on mobile 
protein tails, there is also a static density of charge, p P {r), 
which could in principle be located on the immobile parts 
of the capsid proteins (outside tails). In the further cal- 
culations, we shall neglect the fixed charge on the capsid 
and assume that the static charge in the system resides 
exclusively on the core surface, so that 



p p (r) = a\8{r - Ri), 



(3) 



where <j\ is the charge density at the core surface. The 
contribution of such localized charges to the electrostatic 
part of the free energy can be easily separated (if re- 
quired) from the functional in the form of the boundary 
term as it was done in Ref. HH. See Fig. Q] for the illus- 
tration of the system that we consider and the relevant 
parameters characterizing it. 

The variation of the free energy functional with re- 
spect to fields 3>, and yields two coupled partial 
differential equations. The first one is the generalized 
polyelectrolyte Poisson-Boltzmann equation, 

e eV 2 $(r) = 2ec sinh[/3e$(r)] - ep*(r) 2 + p p (r). (4) 

The second equation is the Edwards equation, 



— V 2 *(r) = w*(r) 3 +pe/3$(rW(r) - u\P(r). (5) 
6 

The Lagrange multiplier (p) enforces the conservation of 
the total number of polyelectrolyte monomers, 



J d 3 r* 2 (r) = N p N t = N, 




FIG. 1: Illustration of the portion of the spherical complex of 
proteins with flexible tails and a charged core. The param- 
eters discussed in the text are indicated. The configuration 
of the tails corresponds to the case when the core is poorly 
charged. 



where N t is the number of monomers in a particular tail, 
and N p is, as before, the number of proteins (or, in gen- 
eral, protein subunits) in the complex. The electrostatic 
boundary conditions are the same as in the standard PB 
theory. In particular, there is a boundary condition at 
r = R\ , reflecting the finite charge density, o\ at the sur- 
face of the core. There arc two additional conditions that 
need to be specified for the density field ^(r). We simply 
take the core to be impenetrable and thus presume that 



(6) 



(7) 
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On the inner part of the capsid one could assume that 



* CR 2 ) = 



AiraR 2 ' 



(8) 



which means that there must be some density of the 
tails at the capsid wall since they are fixed or grafted. 
This is the only way that grafting enters our calculation. 
This condition signifies that in the volume AiraR 2 of the 
spherical shell, touching the interior of the capsid with 
its larger radius, there should be one monomer per each 
of the tails in the structure. In the continuum limit, this 
means that the polyelectrolyte anchors to the capsid per- 
pendicularly (see Fig. [T|). 

The procedure presented thus far is based on the mean- 
field description of the electrostatics of the problem and 
ground-state dominance ansatz for the polyelectrolyte 
field, see Ref. |32| for details. Furthermore, the finite 
extensibility of the polyelectrolyte is not taken into ac- 
count. While that was of no essential importance for the 
problem studied in Ref. [32l . it may become of impor- 
tance in our case, since the protein tails are assumed to 
be grafted to the capsid. To take the geometrical con- 
straints regarding the polyelectrolyte density field into 
account, we shall now derive an alternative set of equa- 
tions that we will obtain from a constrained variation of 
a free energy functional. Instead of varying F comp i ex over 
the space of all functions \&(r), we shall vary it over the 
constrained set of functions, i.e. polyelectrolyte ampli- 
tudes that can be represented as 



ty(r) = * a (r) + u 2 (r), 
where u(r) is a real function and 



*2(r) 



Airar 2 



(9) 



(10) 



The above equation describes the maximally extended 
polyelectrolyte density, so that no smaller values of \&(r) 
are possible without enlarging the monomer- monomer 
separation distances a (stretching). Varying F comp i ex 
over it(r), we obtain the new set of Euler-Lagrange dif- 
ferential equations for $(r) and u(r) as 

e eV 2 $(r) = 2ec sinh[/3e$(r)] - ep[* s (r) + u 2 (r)} 2 , 

(11) 

(12) 



and 



G 



'£*J u ( r )] = s ( r ) 



where C^, s (it) is a differential operator given as 

£* a (u) = VuW s + 2[u 2 V 2 u + 2u{Vu) 2 ]. (13) 

We have used here V 2 ^ = 0. Function s(r) is given as 

s(r) = v(^ 3 s u + 3^ 2 u 3 + 3* s u 5 + u 7 ) 

+ /3e$(* s w + u 3 ) -2tt/x(* s +u 2 ). (14) 



We still have to specify the boundary conditions. Look- 
ing at Eq. ©, one sees that we can no longer put 
^(i?i) = 0, so we will be necessarily stuck with finite 
density of monomers at the core radius. We choose 



u{Ri) = 5, u(R 2 ) = 5, 8 -> 



(15) 



as the appropriate boundary conditions implying that the 
N-tails are normal to the surface of the inner core as well 
as outer capsid wall at R\ and i?2- In other words we 
assume that the chains have no overhangs at R\ and i?2 , 
so that they touch both the core and the capsid only once 
and perpendicularly. Note that Eq. ((SJ is automatically 
satisfied with such a choice. 

The complex free energy that we have constructed 
thus far accounts approximately for electrostatic energy 
of the system, the entropic and excluded volume effects 
of the confined polyelectrolyte (protein tails), but also 
for entropic contributions of the salt ions (on the mean- 
field level). It does not contain, however, the attractive 
component (non-electrostatic) of protein-protein interac- 
tions. These interactions consist of hydrophobic and van 
der Waals contributions 35 and we denote their value per 
protein in the complex as f p .hydro- H 1S this part of 
the energy that keeps the dominantly positively charged 
empty capsids together— 



B. Thermodynamics of the assembly 

The parameter space for assembly that we are in- 
terested needs to be at least four-dimensional (Ri, o~\, 
n c /V , and n p /V), presuming that the properties of the 
capsid proteins are kept fixed. Analysis of assembly in 
such a high-dimensional parameter space would be highly 
involved.— 121 It is of interest thus to try to extract the 
relevant information on the assembly solely from the com- 
plex free energy, as we have defined it. Such a procedure 
cannot be expected to be valid for all values of n c and 
n p , but it should be of use when 



n c < 



-^p(Tnax) 



(16) 



where T m ax is the maximal capsid T-number that can 
be possibly assembled in the experimental circumstances, 
and N p {T max ) is the number of proteins (subunits) in a 
Caspar-Klug structure of that T-number. Basically, Eq. 
([11]) says that, concerning the "final" (assembled) state, 
there is no big difference with respect to the T number 
of the dominantly assembled structures. The final state 
will always consist of all cores complexed with certain 
number of proteins, and the rest will be, more or less the 
same number of proteins that may remain isolated in the 
solution or perhaps form empty capsids. There will be 
no free (uncomplexed) cores in the assembled state. 

Let us assume that prior to the assembly, the solution 
contains isolated charged cores and individual viral pro- 
teins (or protein dimers or whatever the basic subunit of 
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the assembly might be) - this is the "initial" state. In 
the " initial" state, the free energy of the system is 



F% Wpf p "T" ^c-^co 



(17) 



where /„ and F % core are the free energies per isolated pro- 
tein and core in the inital state, respectively. These may 
in principle contain also the translational entropy contri- 
butions. After the assembly, the free energy is ("final" 
state) 

Ff n c F cornp [ ex ~t~ ^c-^pf P: hydro v^p Np)fp- (18) 

The first term on the RHS of Eq. (fl8|) is what we can 
calculate - this is the free energy of the complex (note, 
however, that we do not calculate the entropic term re- 
garding the translational freedom of the assembled struc- 
tures). The number of complexes is exactly n c since we 
assumed that all cores are complexed with the proteins. 
The second term on the RHS of Eq. (fTS)) is the part of the 
attractive energy of proteins assembled in the complexes 
that is difficult to calculate. It contains the hydrophobic 
and van der Waals protein-protein interactions, and per 
protein, the corresponding free energy is f Pi hydro- The 
third term on the RHS of Eq. (|18[) is the free energy of 
the proteins in the final state that are not assembled in 
the complexes with cores {n p — n c N p of them). They may, 

however, be assembled in empty capsids. Thus, in gen- 

— / 

eral, free energy per such protein in the final state (f p ) 
is not the same as the free energy per isolated, individual 

protein in the initial state (f p ). If the proteins that do 
not complex with the cores indeed form aggregates (e.g. 

empty capsids) then one can expect that /„</!. 

The system will proceed from state i to state / if 
Ft < Ft. We want to examine the quantity Ff for com- 
plexes that have different T numbers, i.e. we want to 
find Ff for several different final states /. For all of 
these states, the inital assembly state is the same, so for 
two different final states f± and f 2 we can directly com- 
pare the corresponding free energies Ff t and Ff 2 and if 
Ff 1 < Ff 2 we can say that state /i is more likely to be 
realized in the thermodynamical equilibrium. It is plau- 
sible to assume that 

\F C omplex\ 3> N p \J 'p frydro ~ /pi' C^) 

and if this is indeed the case, then we can examine 
Fcompiex for different complexes and compare them mu- 
tually (for given "initial" state, i.e. charge density and 
radius of the cores) so to judge about the thermodynam- 
ically preferred states. In order to be able to construct 
a phase diagram, i.e. to compare the free energies for 
different initial states, we shall construct the quantity 

AF — F c07np i ex F core . (20) 

This quantity should contain the biggest part of the as- 
sembly free energy, assuming Eq. (| 19[) holds. In the 
following, we shall term AF as the assembly free energy, 
and we shall refer to F comp i ex (also denoted as F) as the 
complex free energy or the free energy of the complex. 



III. NUMERICAL EVALUATION OF THE 
MODEL 

A. Tailless protein subunits 

As already discussed, a prominent feature of the virus 
proteins assembled in a capsid are the N-tails that pro- 
trude into the capsid interior. This feature is typical 
for many viruses and can influence both the energetics 
of the protein-genome assembly^ and its speed. The 
tails are typically very positively charged, and they are 
thus expected to play a prominent role in the assembly 
of proteins with negatively charged cores. The possibil- 
ity of spatial redistribution of the tails is also expected 
to influence the assembly, so that one can expect an in- 
terplay between the electrostatics of the tails and their 
configurational entropy. All of these effects are included 
in our free energy functional, at least approximately. In 
this section, we shall emphasize the electrostatic aspect 
of the capsid proteins and effectively neglect the tail po- 
sitional degrees of freedom. In order to do so, we treat 
the N-tails as infinitely rigid with the vanishing inter- 
monomer bondlength a. The problem then reduces to 
electrostatic interactions only since the entropy of the N- 
tails is quenched. Omitting the hydrophobic, vdW etc. 
energy of the capsid proteins, one is left with a purely 
electrostatic part of the total free energy. 

As a first approximation one can smear the charge of 
the N p proteins uniformly over the external sphere of ra- 
dius R 2 , so that its surface charge density is a 2 . The 
problem as described by this model system is stil not 
completely trivial as the Poisson-Boltzmann (PB) equa- 
tion describing it is of course nonlinear—. However, some 
insight can be obtained by linearizing the PB equation 
and solving it in this approximation (Debye-Hiickel, DH). 
The details are elaborated in the Appendix and the fi- 
nal result for the electrostatic free energy is equation 
(|52")l . One can further simplify it by using kR\ ^> 1, 
which is a condition typically met for the experiments 
done on viruses at physiological salt concentrations (~ 
100 mM)^i. In this case, 



lilll Fcompiex 
k_Ri»1 



+ AR 1 R 2 o- x o- 2 e< R i- R ^ + Rlo-le*< R '- R ^ 
+ R\°l)- (21) 

One should note here that the salt resides only in com- 
partments III and II (see Fig. []}, so there is no symmetry 
in the formula regarding a\ , a 2 and R\ and R 2 . The first 
term in Eq. (j2Tj) is the electrostatic self-energy of the 
core, and the third term is the self-energy of the protein 
shell. Note that the self-energy of the core has a prefactor 
of 2 with respect to the analogous term for the protein 
shell. This is due to the fact that the shell is screened 
by the salt both from the inside and the outside, which 
is not the case for the impenetrable core (note also that 
the dielectric constant of the core does not figure in the 
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final equations). 

The second term in Eq. [5T] is the electrostatic interac- 
tion free energy between the core and the protein shell. 
One easily sees that it will be minimized when o~\ and a? 
have different signs. We see that it decreases quickly as 
the distance between the shell and the core increases, i.e. 
it decreases as exp[— k[R,2 — Ri)]- 

Though the DH solution enables us to study the ener- 
getics of the assembly in more details, at least numeri- 
cally one needs to check its validity vs. the complete non- 
linear Poisson-Boltzmann theory, see Ref. [H. By com- 
paring it to the exact solution of the Poisson-Boltzmann 
equation one simultaneously checks the numerical results 
and the analytical formula for the DH solution. This 
comparison is shown in Fig. [5] 
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FIG. 2: Comparison of the numerically exact results with 
the analytic DH expression (|32|) for the free energy of the 
capsid with the charged core (complex free energy). The free 
energy per protein subunit is plotted as a function of the core 
surface charge density cri. The parameters of the calculation 
are denoted in the figure. 

We have chosen the capsid parameters to approxi- 
mately represent the T=3 capsid of the BMV (bromc 
mosaic virus) with capsid radius R2 — 16 nm. The sur- 
face charge density of the capsid cr 2 = - 0.504 e/nm 2 
was obtained by smearing the 18 charges per each of the 
N p — 90 dimeric tails over the capsid surface. Note that 
the absolute signs of the charge don't matter; what is im- 
portant is that the charge on the core is of the opposite 
sign from the charge on the capsid. The core radius Ri 

- 15.125 nm was chosen to be in the regime when the 
attractive interaction is not completely screened by salt 

- one can see this effect as a minimum in the free energy 
for some core charge density a±. This minimum disap- 
pears when the distance between the core and the capsid 
is larger than ~ 1/k. In the chosen case, the minimum 
is at (7i ~ 0.3 e/nm 2 . 

As in the case studied in Ref. UH, the DH results al- 
ways produce larger free energies than the exact Poisson 

- Boltzmann calculation. They are of course better when 



the potentials in the solution are small enough, so that 
the linearization approximation holds - this is the case 
for not too large surface charge densities, and it can be 
clearly seen that the DH approximation fails worse as <j\ 
(or (72 ) increases, again in complete agreement with the 
results of Ref. Oj. 

Having established some confidence in the DH results, 
we can scrutinize them a bit more closely. If one takes 
Ri = i?2 and o\ — — 02 in Eq. (f2~lj) , one obtains the ab- 
solute minimum of the complex free energy in the whole 
i?i,i?2jCi and (72 space. At that point in the param- 
eter space the free energy is exactly zero which is the 
absolute minimum. This is due to the fact that the core 
charge exactly neutralizes the protein charge yielding ef- 
fectively the uncharged shell (a = o\ + 02 = 0) of radius 
R = Ri = i?2- Note, however, that the assembly free 
energy is not bounded from below. 

We now insert Eq. (f2"Tj) in Eq. (|2"0"]) and find 



<7l(72e 



K(fll-fl 2 ) 



C() 

R\o\e 2 ^-^) 



R 



'Pi] 



(22) 



Examining the assembly free energies (Eq. (|2"0"]) i of the 
variously sized (i?i) and charged (cti) cores with the pro- 
teins assembled in capsids of three different T number, we 
obtain the exact results for the proteins with no tails are 
shown in Fig. [3] These were obtained by solving full PB 
equation, without linearizing it. We see that the phase di- 




FIG. 3: Numerically exact results for the assembly free energy 
(Eq. I20|) assuming tailless protein subunits (co = 100 mM). 
The free energies were plotted for capsids with three different 
T-numbers as functions of the core surface charge density o\ 
and radius R\ . The number of capsomeres and corresponding 
capsid radii are iV p =30 (T=l), 60 (T="2"), and 90 (T=3), 
and Ri = 9.24 (T=l), 13.06 (T="2"), and 16 nm (T=3). 

agram in this case is totally " flat" - the smallest possible 
T-number structure will always be the one with smallest 
complex energy, irrespectively of the core charge and its 
radius (at least in the range of values considered here, and 
the conditions summarized in III Bp . Thus, the phase di- 
agram is governed solely by simple geometry. The same 
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result comes out also in the DH approximation. Note 
also how the assembly free energy shows practically no 
dependence on o\ and R\ when R2 — R\ 3> l/n. This 
is due to the fact that in this regime, the assembly free 
energy is simply the (positive) electrostatic self-energy of 
the capsid which depends only on R2, i.e. the T num- 
ber of the capsid (see also Eq. (|22|) for the Debye-Huckcl 
description of this situation). When Ri — R\ ~ 1/k (i.e. 
about 1 nm in our case, c = 100 mM), the electrostatic 
attraction between the core and the capsid becomes only 
partially screened by salt ions, and the free energy of the 
assembly, AF, becomes negative suggesting that the as- 
sembly is thermodynamically preferrable, releasing extra 
energy in the solution. The assembly is also more efficient 
for larger core charge densities. 

We still need to consider what happens in the case 
when the protein charges are delocalized on the flexi- 
ble polyelectrolyte tails. In particular, the decrease of 
the volume of the space between the core and the cap- 
sid when Ri — > R 2 would importantly confine the poly- 
electrolyte tails. We can thus expect their entropic and 
self-interaction contributions to become of the largest im- 
portance in the region where the assembly free energy of 
the tailless monomers shows the most negative values. 



B. Protein subunits with N-tails 

We now assume that the protein charge is distributed 
along the flexible N-tails, i.e. we take all the details of 
the model developed in Sec. |TT]into account. First, we 
calculate the free energies of the complex without the ac- 
count of the finite extensibility of the tails, i.e. we solve 
Eqs. (H|) and (0 with boundary conditions for the poly- 
electrolyte amplitude as specified in Eqs. ([7]) and ([5]). 
From the thus obtained free energy, we subtract the elec- 
trostatic self-energy of the core. The ensuing assembly 
free energies are shown in Fig. [4] 

We see that although the assembly free energies are 
of the same order of magnitude as in the case of tailless 
capsomeres, the shape of the free energy landscape is 
quite different and the assembly is now governed both 
by the core charge and by its radius. This is most easily 
seen by the complicated shape of the regions denoted by 
T = 1, T = "2", and T = 3 in the o x - R x plane in 
Fig. [U which correspond to capsids with the lowest free 
energy. We also see that when the two radii are close 
to each other, the free energy steeply rises. This is easy 
to understand, since in this case, the tails are forced to 
redistribute in a small volume in between the capsid and 
the core, so the contribution of entropic confinement to 
the free energy becomes significant. Interestingly, in the 
case of an infinitely thin capsid studied in the previous 
section, it was in these regions that the assembly free 
energy sharply dropped, exactly the opposite from the 
case we have here. It is thus clear at this point that the 
tails do introduce different physics into the problem. The 
assembly free energy in general also decreases with a± 




FIG. 4: Assembly free energies calculated as functions of the 
core radius Ri, its charge o\, for three different capsid radii, 
R2 (and N p ), as quantified by three different T numbers. The 
parameters of this calculation are a = 0.5 nm, p = 1, v=0.5 
nm 3 , c = 100 mM, N t = 18. 



which is an effect that we saw in the previous subsection 
and is due solely to electrostatics. We also observe that 
the values of the free energy are mostly smaller from the 
ones obtained in the model of an infinitely thin charged 
capsid, which essentially means that the tails can adopt 
such conformations that reduce the electrostatic part of 
the free energy to a significant extent, especially in the 
electrostatically unfavorable regime. Note also how the 
structure with the lowest assembly free energy increases 
its radius (and total charge) as the core charge increases 
(e.g. for i?=8.25 nm one can see the progression from 
T = 1. T = 2 to T = 3 structures as o\ increases). 
This can be simply explained by the screening of the 
core by the capsomer tails - the more charged the core, 
the more capsomeres (i.e. larger T numbers) are needed 
to screen it efficiently But note here that the tails are 
assumed to be maximally flexible and can thus easily 
stretch from practically arbitrary distances (capsid) to 
the core in order to screen it. For given core radius, R\, 
the assembly free energy is positive when u\ = 0, and 
it decreases as o~\ increases, becoming negative for some 
" critical" core charge density, whose typical values in the 
range of parameters considered are o\ ~ e/nm 2 . Thus, 
in order for the assembly to proceed spontaneously, the 
cores need to be sufficiently charged. 

To see whether the results are influenced by the maxi- 
mal extensibility of the tails, in Fig. [5]we plot the assem- 
bly free energy using the maximal extensibility ansatz in 
Eq. (flC)]) together with boundary conditions in Eq. (j 1 5[) . 
Again, the assembly free energies are of the same order of 
magnitude, but the borders between the different regions 
in the a\ — R\ plane are quite different with respect to 
those obtained without the maximal extensibility ansatz. 
In order to better understand the results obtained thus 
far, it helps to plot the spatial distribution of the protein 
monomers, i.e. \l/ 2 (r), within the space between the core 



8 



T=3 




FIG. 5: Assembly free energies as functions of the core radius 
Ri, its charge cri, for three different capsid radii, R2 (and N p ), 
as quantified by three different T numbers. The maximal 
extensibility ansatz was used to obtain these results. The 
parameters of this calculation are a — 0.5 nm, p = 1, v=0.5 
nm 3 , c = 100 mM, N t = 18. 



and the inner capsid radius. A solution for the case when 
N p = 90 (1620 monomers in total), R 2 = 16 nm (T =3), 
i?i = 9.875 nm, o\ = 0.5 e/nm 2 is shown in Fig. [SJ 



0.9 



unconstrained variation, v F(R,)=0 



constrained variation, u(R,)=0 + 




12 13 
r [nm] 
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16 



FIG. 6: Comparison of the unconstrained monomer density 
(thick dashed line) and constrained monomer density (full 
line) approaches. The parameters of the calculation are o\ = 
0.5 e/nm 2 , c = 100 mM, i?i = 9.875 nm, R 2 = 16 nm, 
iV p = 90, N t = 18. The maximal extensibility limit [Eq.lfTu}] 
is denoted by a thin dashed line. 

What this figure nicely illustrates is that the tails 
stretch out from the capsid surface towards the core so 
that they accumulate around the oppositely charged core. 
There are still some monomers in the space between the 
core and the capsid, but the dominant density is situ- 
ated in a shell around the core. One can also see how 
the constrained solution stays above the maximal exten- 
sibility limit in Eq. (|10p . One should keep in mind that 
the boundary conditions for the two solutions are differ- 



ent, so one cannot expect that the constrained variation 
will always yield larger energy. This is also the case for 
the displayed calculation where we find F — 927 ksT, 
and F — 788 ksT for the unconstrained and constrained 
values of the complex free energy, respectively. Note how 
the constrained polyelectrolyte density in fact approaches 
closer to the charged core than the unconstrained density 
due to the different boundary conditions that the two sat- 
isfy. The noted effect results in the lowering of the free 
energy in the constrained case. However, this is not al- 
ways the case, and it depends on the distance between 
the core and the capsid (R2 — Ri)- When this distance is 
sufficiently large, the tails cannot accumulate around the 
core even when they maximally stretch, so that they can- 
not screen the core efficiently. This effect is not present 
in the calculation with the unconstrained polyelectrolyte 
amplitude. This is in fact the most important reason for 
the different look of boundaries in the a\ — R\ plane for 
the two calculations. Note how in the unconstrained case, 
the complex of core with T = 3 capsid has the lowest en- 
ergy for sufficiently large a\ (p\ > 1.5 e/nm 2 ) in a huge 
range of radii Ri (8-16 nm, see Fig. @|. However, the 
maximal length of the (fully stretched) tails is N t a — 9 
nm, so that for R\ — 8 nm tails are very much extended 
and only a small part of the polyelectrolyte density can 
gather around the core. As the unconstrained results 
do not account for this effect, by breaking the maximal 
extensibility limit in Eq. (fTTJ|) the polyelectrolyte density 
screens the core efficiently and thus lowers the free energy 
of the T = 3 structure with respect to the value obtained 
in the constrained calculation. This effect is illustrated 
in Fig. [7] for T = 3 {R 2 = 16 nm, N p = 90), -Ri = 8 nm, 
(Ti = 1.5 e/nm 2 . Note how the maximal extensibility 
limit is severely broken in the unconstrained calculation. 
The free energies of the complexes are F — 2710 ksT, 
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FIG. 7: Comparison of the unconstrained monomer density 
(thick dashed line) and constrained monomer density (full 
line) approaches. The parameters of the calculation are o\ = 
1.5 e/nm 2 , c = 100 mM, R x = 8 nm, R 2 = 16 nm, N p = 90, 



Nt = 18. The maximal extensibility limit [Eq. (|10|l ] is denoted 
by a thin dashed line. 
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and F = 3520 ksT in the unconstrained and constrained 
calculations, respectively, demonstrating the effect dis- 
cussed. 

It is of interest to examine the efficiency of core screen- 
ing by the polyelectrolyte tails somewhat closer. We de- 
fine the ratio 



e 



N t N p 



(23) 



which can be interpreted as a percentage of monomers 
in a shell of thickness a (monomer-monomer separation) 
around the core. This can also be thought of as the per- 
centage of monomers that cover (are "in contact with") 
the core. In Fig. [5] we display the coverage as a func- 
tion of a i and R\ for T = 3 capsid (i?2 = 16 nm). 




FIG. 8: Coverage of the core defined by Eq. ([23} as a function 
of o\ and Ri for a complex with T—3 capsid (R2 = 16 nm) . 
The calculation was performed with the maximal extensibility 
constraint on the polyelectrolyte amplitude. 



What can be easily seen from this plot is the grad- 
ual shift of the polyelectrolyte density from the space 
close to capsid to the shell surrounding the core as the 
core charge density (01) increases. Due to the maxi- 
mal extensibility constraint that was included in these 
calculations, O parameter saturates at a value smaller 
than 1 as <T\ increases. This simply reflects the fact that 
the polyelectrolyte must pass through the region between 
the core and the capsid in order to accumulate around 
the core, and so much of its density may remain in the 
space in between the capsid and the core. This effect 
becomes more important when (Ri — Ri) becomes com- 
parable to the tail length, N t a. The charge density of 
the core at which one observes the significant accumula- 
tion of the monomers around the core {a\ ~ 0.5 e/nm 2 ) 
is the same as the charge density at which the assembly 
free energy becomes negative, i.e. the assembly proceeds 
spontaneously. 



IV. CONCLUSIONS 

An intriguing feature of our results is that both the 
core charge and its radius determine the size of the cap- 
sid around the core. A particularly interesting case is 
when the core radius is close (but somewhat smaller) to 
the T=l capsid, i.e., R\ =8 nm. For sufficiently small 
core surface charge density {a\ < 0.5 e/nm 2 ), T = 1 
structures around the cores shall form. However, if the 
surface charge density increases over some critical value 
(around 1.0 e/nm 2 in the constrained model of the poly- 
electrolyte), T="2" capsids shall form, in spite of the 
fact that the core radius is more than 5 nm smaller from 
the radius of T="2" capsid (see Fig. This clearly 
shows that in addition to core radius, which dominantly 
influences the assembly process, one needs to have an ad- 
equate charge density in order to produce the structures 
of desired T-number. The same effect is present on the 
T ="2" and T = 3 border when R x > 11.6 nm. We show 
this transition region in another way in Fig. [HI Note that 




ct 1 [e/nm 2 ] 

FIG. 9: Free energy of the complexes for three different core 
radii in the transition region as a function of core surface 
charge density. The dotted and full lines display the results 
for T=3 and "2" complexes, respectively. The parameters 
of the polyelectrolyte are the same as before and the cal- 
culations were performed for the constrained polyelectrolyte 
model (maximal extensibility limit obeyed). Grey regions de- 
note the surface charge densities 01 where the T=3 structure 
has lower free energy. 
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although the free energy curves indeed cross for certain 
values of o\ and Rx, their magnitudes remain quite sim- 
ilar even deeply in the transition region. Thus one could 
expect to observe a polydisperse distribution of T="2" 
and T = 3 structures in a solution with a monodisperse 
distribution of core size and charge density. 

An interesting effect is observed for Rx — 12.075 nm in 
Fig. [5] where there exists two intersections between the 
free energy curves for T="2" and T—3 complexes as <Ji 
increases. If the charge density is lower than 0.3, T=3 
structures have the lowest assembly free energy. Quite 
surprisingly, upon increasing the charge density, T = 2 
structures become the dominant ones, i.e. the size of 
thermodynamicaly preferred capsids decreases. This is 
mainly due to the fact that with increasing the charge 
density, the monomers in the N-terminal tails prefer to 
sit next to the core, i.e., the electrostatic interaction wins 
over the chain configurational entropy. However, as we 
increase the core charge density beyond 0.7, it becomes 
more advantageous to have T = 3 structures again as 
there will be more charges associated with the N-tails of 
T=3 structures. 

An even more interesting effect is illustrated in Fig. 
9 for i?i = 11.7 cores. Here, we observe three transi- 
tion lines instead of two of the previous case. Somewhat 
counter- intuitively, for the charge densities above 1.3, 
T="2" particles become free energy minima structures 
again. This indicates that under physiological conditions, 
if we increase the charges on the cores significantly, the 
lowest possible T structures (T = 2 in this case) form 
so that more charges on the N-tails can sit in the imme- 
diate vicinity of the core. This is due to the maximal 
extensibility constraint, and the effect is not present in 
the calculation with the tails that do not satisfy the con- 
straint (see Fig. QJ. For sufficiently charged cores (ci > 2 
e/nm 2 ), it turns out to be better again to have smaller 
number of total charges (i.e. the T = 2 instead of T = 3 
capsid) that can approach the capsid easily without any 
geometrical constraint. 

It is obvious from the previous discussion that the finite 
extensibility of the tails is important for determination 
of the lowest energy structures. One can see this most 
easily by comparing Figs. 0] and [5] This effect becomes 
particularly important for structures in which the core 
is significantly smaller than the capsid, i.e. when (i?2 — 
Ri)/(N t a) > 1. 

To test the theory of our previous paragraph , we re- 
peated our calculations at lower salt concentrations. The 
results are presented in Fig. [10] As it is shown in the fig- 
ure the transition line from T=3 to T="2" is moved to 
very high a as expected. 

In summary, we have demonstrated that the ther- 
modynamics of the assembly nontrivially depends on 
the electrostatic and geometric constraints which in- 
clude k(Ri — R2) (electrostatic screening), maximal pos- 
sible stretching (R2 — Ri)/{N t a) ~ 1, and confinement 
(i?2 — R\)/a < 1 of the protein tails. Our study pro- 
vides quantitative guidelines for experiments aiming to 
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FIG. 10: Regions in the Ri — o\ space in which a particular 
T-number structure has the lowest free energy, as denoted. 
Panels (a) and (b) show the results for monovalent concen- 
trations of Co = 100 mM and Co = 10 mM, respectively. The 
two vertical dash-dotted lines denote the radii of T =1, and 
"2" structures (R 2 ). 



assemble "hybrid" structures, i.e. protein shells around 
charged and impenetrable cores, especially in the limit 
when the number of cores is much smaller than the num- 
ber of proteins. 
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Appendix: Derivation of Debye-Hiickel formulas 

We solve the linearized Poisson-Boltzman equation for 
the system with tailless capsomeres. The electrostatic 
potential can be written in region I (see Fig. [T} as 



*/(r) = C, 



in region II as 



(24) 



$ Jj(r)=A fEE(z^ +B f^) ) (25 ) 



c 



Rioi + R 2 o 2 e <Rl - R2) 



(29) 



and 



-K-R.2 



D 



2eoe r K(l + kR\) 



{kR x - l)R 2 a 2 e 2KRl 

(1 + KR 1 )R 2 <j 2 e 2KR2 ] .(30) 



and in region III as 



$m(r) = D 



exp(— nr) 



(26) 



where k is the Debye-Hiickel screening length, and 
A, B, C, and D are unknown constants that are to be 
determined from the two boundary conditions at R\ and 
two at R 2 . This yields 



A = 



B = 



2eoe r «;(l + kRi) 
R 2 o 2 e~^ 



(27) 
(28) 



The electrostatic free energy can be written as 



F = I d *r9m?) 



(31) 



which yields 



We n 2KR \ , x {AnRlR^^e^ 
nil + nRi) I 



7re 

e e r n{l + kRi) 
{kRi - \)R\u\e 2KRl 

[2K,R\al + (1 + KR^Rla^e 2 ^ 2 } (32) 
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